🌻 Causal mapping as causal QDA

Abstract#

This paper argues that causal mapping can be treated as a serious form of Qualitative Data Analysis (QDA): a disciplined variant in which the primary coding act is not “apply a theme” but “code a causal link” (an ordered pair of factor labels, grounded in a quote and source). The resulting dataset is a structured, auditable qualitative model (a network of causal claims) that can be queried using a transparent library of operations (filtering, path tracing, label transforms, bundling). This gives researchers a way to keep qualitative judgement central while making key intermediate products more reproducible, checkable, and scalable, especially when using AI as a constrained, low-level coding assistant rather than a black-box analyst.

This paper is written for QDA/CAQDAS users who want: (i) a clear definition of causal coding as a qualitative method, (ii) an account of why it can be more reliable than open “theme finding”, (iii) a transparent model of how causal-mapping outputs support analysis, and (iv) a realistic positioning relative to neighbouring approaches (thematic analysis, qualitative content analysis, conversational AI analysis).

Unique contribution (what this paper adds): a reframing of causal mapping as a disciplined member of the QDA family, in which the primary coding act is a quote-grounded causal link; a transparent “library of operations” model of how causal-mapping outputs support analysis; and a constrained, checkable role for AI as a low-level coding assistant rather than an analyst.


2. Why causal coding is often easier (and more checkable) than “find the themes”#

The instruction “find the main themes” is (legitimately) open-ended: it depends on theoretical stance, positionality, research question, and the analyst’s preferred granularity. That openness is often a feature of “Big-Q” qualitative work; but it makes systematic comparison and scale difficult (e.g. thematic analysis, n.d.).

This difference is also visible when people use generative AI. “List the main themes in this document” can be a useful time-saver, but it is massively sensitive to what one means by theme (and to the analyst’s implicit theory of what “matters”). You can narrow the prompt (“Identify the main kinds of relationship issues mentioned”), but at that point you are already moving from open generation towards a more constrained extraction task.

The causal coding task is narrower:

Identify each passage where the text says that one thing influenced another, and record what influenced what.

This does not remove judgement (labels still matter; causal language can be ambiguous), but it reduces degrees of freedom at the point of coding. In practice, that usually improves:

- reliability: different coders are more likely to extract the same links from the same passage;
- checkability: every link can be audited against the quote and source that ground it;
- scalability: the same narrow instruction can be applied consistently across a large corpus.

This is a “small-Q” move: it is not a claim that causal QDA replaces interpretive qualitative work, but that it is a useful, rigorous option when the research questions are themselves causal (which they often are in evaluation and applied social research).

Consider:

“After the clinic started opening on Saturdays, I didn’t have to miss work, so I could actually attend.”

A “theme finding” pass might code: Access, Clinic opening hours, Employment constraints, Attendance.

A causal-coding pass would typically try to capture the explicit influence structure:

- Clinic opens on Saturdays -> No need to miss work
- No need to miss work -> Able to attend

Both can be valuable, but the causal representation is immediately queryable as a mechanism (and can be checked line-by-line against quotes).


3. The output is not “just codes”: it is a queryable qualitative model#

Ordinary QDA typically culminates in a narrative account plus some supporting tables. Causal QDA yields a different primary object: a network of causal claims.

Once you have a links table, you automatically have a graph:

- each factor label is a node;
- each coded link is a directed edge from influencing factor to influenced factor, annotated with its quote and source.

That graph is a qualitative model in a specific sense:

- every node and edge is traceable to verbatim evidence from an identified source;
- it asserts only that these causal claims were made in the data, not that the claims are true.

The payoff is that you can answer many questions by querying this model—without needing to ask an AI to produce a global synthesis, and without hiding methodological steps inside the analyst’s head.
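
As a minimal sketch, assuming a simple record layout (the field names and helper below are illustrative, not a fixed schema), the links table from the clinic example in section 2 is already queryable:

```python
# A minimal links table: one row per coded causal link, grounded in a
# quote and a source. Field names here are illustrative assumptions.
links = [
    {"cause": "Clinic opens on Saturdays", "effect": "No need to miss work",
     "quote": "After the clinic started opening on Saturdays, I didn't have to miss work",
     "source": "R01"},
    {"cause": "No need to miss work", "effect": "Able to attend",
     "quote": "I didn't have to miss work, so I could actually attend",
     "source": "R01"},
]

def influences_of(factor, links):
    """Direct drivers: the links whose effect is the given factor."""
    return [l for l in links if l["effect"] == factor]

# Query the model: what do respondents say drives attendance?
for link in influences_of("Able to attend", links):
    print(f'{link["cause"]} -> {link["effect"]} ({link["source"]})')
```

Every answer such a query returns arrives with its quotes and sources attached, which is exactly the auditability the method promises.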


4. A transparent “library of operations” (filters + views)#

Causal mapping analysis is often best described as a pipeline: pass the links table through a sequence of operations, then render a view (map/table).

At a high level these operations fall into:

- filters (select links by source attributes, factor labels, or link metadata);
- label transforms (rewrite factor labels for a particular view);
- bundling (aggregate parallel links between the same pair of factors);
- path tracing (follow chains of links between factors).

The crucial methodological point is not which software you use, but that the meaning of a derived map is always:

“This view of the evidence, after these explicit transformations.”

This is what makes the method checkable and extensible: other researchers can replicate the same pipeline on the same links table, or change one step and see what changes.
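
One way to make this concrete, as a sketch only (the function names are assumptions, not a standard API), is to treat each operation as a plain function from links table to links table, so a pipeline is just function composition:

```python
from functools import reduce

def pipeline(links, *operations):
    """Pass a links table through a sequence of operations, in order."""
    return reduce(lambda table, op: op(table), operations, links)

# One illustrative filter: keep only links coded from the given sources.
def from_sources(wanted):
    return lambda links: [l for l in links if l["source"] in wanted]

# Provenance is explicit in the call itself, e.g.:
# view = pipeline(links, from_sources({"R01", "R02"}))
```

On this reading, “this view of the evidence, after these explicit transformations” is literally the argument list of the pipeline call.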

4.1 Example: a concrete analysis pipeline (from a broad corpus to a specific view)#

Suppose you want: “Mechanisms connecting Training to Adoption, but only for the younger respondents, and only where more than one source supports each link.”

One explicit pipeline could be:

1. Filter sources: keep only links coded from younger respondents (using source metadata such as age).
2. Trace paths: keep only links that lie on directed paths from Training to Adoption.
3. Require corroboration: keep only links supported by more than one source.
4. Render the surviving links as a map.

The point is not that this exact pipeline is “right”, but that it is explicit: the reader can see what was done, rerun it, and inspect what evidence is inside the view.
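
A sketch of that pipeline as runnable code (the data, field names, age threshold, and helper names are all illustrative assumptions):

```python
# Illustrative links, each carrying source metadata (here, respondent age).
links = [
    {"cause": "Training", "effect": "New skills", "source": "R02", "age": 24},
    {"cause": "Training", "effect": "New skills", "source": "R05", "age": 27},
    {"cause": "New skills", "effect": "Adoption", "source": "R02", "age": 24},
    {"cause": "New skills", "effect": "Adoption", "source": "R05", "age": 27},
    {"cause": "Training", "effect": "Adoption", "source": "R09", "age": 61},
]

def filter_younger(links, max_age=30):
    """Step 1: keep only links coded from younger respondents."""
    return [l for l in links if l["age"] < max_age]

def on_paths(links, start, end):
    """Step 2: keep only links lying on a directed path from start to end."""
    outgoing = {}
    for l in links:
        outgoing.setdefault(l["cause"], []).append(l)
    keep = []
    def walk(node, trail):
        if node == end:
            keep.extend(trail)
            return
        for l in outgoing.get(node, []):
            if l not in trail:  # avoid reusing a link (handles cycles)
                walk(l["effect"], trail + [l])
    walk(start, [])
    return [l for l in links if l in keep]

def min_sources(links, n=2):
    """Step 3: keep only links supported by at least n sources."""
    support = {}
    for l in links:
        support.setdefault((l["cause"], l["effect"]), set()).add(l["source"])
    return [l for l in links if len(support[(l["cause"], l["effect"])]) >= n]

view = min_sources(on_paths(filter_younger(links), "Training", "Adoption"))
for l in view:
    print(l["cause"], "->", l["effect"], f'({l["source"]})')
```

Note what the pipeline quietly discards: the single older respondent’s direct Training -> Adoption claim never reaches the view, and the reader can see exactly which step removed it.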


4.2 Causal QDA as “qualitative split‑apply‑combine” (and why this is a good place for AI)#

A useful way to describe this workflow is as a qualitative variant of the split‑apply‑combine strategy: break a hard analytic problem into manageable pieces, operate on each piece consistently, then recombine into a coherent answer (Wickham, 2011).

In causal QDA, the mapping is unusually clean:

- split: divide the corpus into sources and passages (chunks);
- apply: code each passage with the same narrow instruction (record what influenced what, with a quote);
- combine: merge the resulting links into a single table, which is the graph.

This framing also clarifies a practical division of labour when using LLMs: the AI can be confined to the narrow, checkable “apply” step (proposing candidate links for one chunk at a time), while the “split” and “combine” steps stay deterministic and under the researcher’s control, as sketched below.
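
A sketch of that division of labour (split and combine are deterministic; `code_chunk` is a hypothetical placeholder for a human coder or a constrained LLM call):

```python
def split(sources, chunk_size=500):
    """Deterministic: break each source text into passages."""
    for source_id, text in sources.items():
        for i in range(0, len(text), chunk_size):
            yield source_id, text[i:i + chunk_size]

def code_chunk(source_id, passage):
    """The 'apply' step. Hypothetical placeholder: in practice a human
    coder or a constrained LLM call that must quote its evidence."""
    links = []
    if "so I could" in passage:  # toy heuristic, for illustration only
        links.append({"cause": "No need to miss work", "effect": "Able to attend",
                      "quote": passage.strip(), "source": source_id})
    return links

def combine(per_chunk_links):
    """Deterministic: concatenate per-chunk links into one table."""
    return [link for links in per_chunk_links for link in links]

sources = {"R01": "After the clinic started opening on Saturdays, "
                  "I didn't have to miss work, so I could actually attend."}
table = combine(code_chunk(sid, chunk) for sid, chunk in split(sources))
```

Only `code_chunk` involves judgement; everything on either side of it is mechanical and rerunnable.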

5. Reproducible ↔ emergent: where causal QDA sits#

Many qualitative traditions sit towards the emergent end of a spectrum: questions, codes, and interpretations are refined through an iterative, interpretive process, and the final output is primarily narrative synthesis.

Causal QDA tends to sit further towards the reproducible end on some dimensions:

- the unit of coding is fixed in advance: a causal link grounded in a quote and source;
- the coding task is narrow enough to compare coders (or repeated, constrained AI runs) against each other;
- every derived view is produced by an explicit, rerunnable sequence of operations.

This does not mean causal QDA is “objective” or that it removes positionality. It means that a larger portion of the analysis chain becomes explicitly inspectable: anyone can trace from a map edge to the underlying quotes, and from a view to the sequence of operations that produced it.


6. AI in causal QDA: the “low-level assistant”, not the analyst#

Generative AI can be used in at least two ways:

- as a high-level analyst, asked to read the material and deliver a global synthesis;
- as a low-level assistant, asked to perform a narrow, checkable extraction task on each chunk.

Our stance is the second. The reason is not moral; it is methodological: a global synthesis is hard to audit, whereas a link-level proposal can be checked, quote by quote, against the passage it claims to describe.

Once you have the links table, most analysis steps can be deterministic and transparent (filters, transforms, bundling, path tracing), keeping the core interpretive burden where it belongs: on the human researcher.

6.1 Example: what “constrained” AI assistance looks like (and what it should output)#

For a given transcript chunk, the AI can be instructed to output a list of candidate links, each with:

- an influencing factor label (the cause);
- an influenced factor label (the effect);
- the verbatim quote that grounds the link;
- the source and chunk identifiers.

Example output items might look like this (a hypothetical format; the field names are illustrative, not a fixed schema):
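
```python
# Hypothetical candidate links proposed for one chunk (illustrative fields).
candidate_links = [
    {"cause": "Clinic opens on Saturdays",
     "effect": "No need to miss work",
     "quote": "After the clinic started opening on Saturdays, I didn't have to miss work",
     "source": "R01", "chunk": 3},
    {"cause": "No need to miss work",
     "effect": "Able to attend",
     "quote": "I didn't have to miss work, so I could actually attend",
     "source": "R01", "chunk": 3},
]
```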

These are still proposals (humans can edit labels and reject links), but each item is locally auditable.


7. Relationship to neighbouring QDA approaches#

7.1 Thematic analysis / qualitative content analysis#

Causal QDA is compatible with many QDA workflows: it can be used alongside thematic coding (e.g. thematic analysis, n.d.) or qualitative content analysis (e.g. Mayring’s approach, n.d.), or as a front-end that captures a causal layer of meaning that is often what evaluators ultimately need (drivers, barriers, mechanisms, pathways).

The key difference is that causal QDA stores relations (ordered pairs), not only categories. That enables subsequent reasoning and querying that is hard to do robustly if you only have unconnected theme tags.

7.2 “Post-coding” conversational analysis with AI#

Some AI-assisted QDA approaches propose moving away from coding into a structured dialogue with an AI, using question lists and documented conversations as the main analytic trace (Friese, 2025). This is an interesting and plausible direction for some kinds of interpretive work.

Causal QDA differs in that it retains a strong “small-Q” intermediate representation: a quote-grounded links table plus deterministic analysis steps. The point is not to rule out hermeneutic/interpretive approaches, but to offer a complementary workflow that keeps intermediate claims explicit and machine-checkable.


8. Practical extensions (brief pointers)#

Two extensions are particularly central in practice:

- hierarchical “zooming”: arranging factor labels into hierarchies, so the same links table can be viewed at coarser or finer granularity;
- magnetisation: mapping diverse in-vivo labels onto shared “magnetic” labels (section 8.1 below).

Both are treated as explicit, auditable label transforms: they rewrite labels, not the underlying evidence.

8.1 Example: magnetisation as a label transform (not a re-interpretation)#

Suppose your raw in-vivo factor labels include:

- “Couldn’t afford the bus fare”
- “Transport is just too expensive”
- “The cost of getting to the clinic”

Magnetisation can map these to a shared magnetic label such as Transport cost barrier, letting you aggregate evidence without deleting the original wording. The original link evidence remains traceable; you are rewriting labels for a particular view.
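
A sketch of magnetisation as a pure label transform (the mapping table and function are illustrative):

```python
# An explicit, auditable mapping from raw in-vivo labels to magnetic labels.
magnets = {
    "Couldn't afford the bus fare": "Transport cost barrier",
    "Transport is just too expensive": "Transport cost barrier",
    "The cost of getting to the clinic": "Transport cost barrier",
}

def magnetise(links, magnets):
    """Rewrite labels for a view, keeping the original wording alongside."""
    return [{**l,
             "cause": magnets.get(l["cause"], l["cause"]),
             "effect": magnets.get(l["effect"], l["effect"]),
             "raw_cause": l["cause"],
             "raw_effect": l["effect"]}
            for l in links]
```

Because the raw labels travel with the magnetic ones, any aggregated edge in a view can still be unpacked into the respondents’ original wording.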


9. Limits and caveats (non-negotiable)#

9.1 Example: the transitivity trap (why “path tracing” needs constraints)#

Source A says:

“The training gave us new skills.” (Training -> New skills)

Source B says:

“Once people had the skills, they adopted the practice.” (New skills -> Adoption)

It is tempting to infer a “path” Training -> Adoption. But unless you impose (and report) constraints, e.g. tracing threads only within a single source, or treating only same-source chains as evidence for an indirect mechanism, you risk stitching together a mechanism that no one actually claimed. One such constraint is sketched below.
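
A sketch of a within-source constraint (the helper name and the two-step restriction are illustrative):

```python
def same_source_chains(links, start, end):
    """Two-step chains start -> m -> end where both links share a source,
    so the indirect mechanism was claimed by someone, not stitched
    together across sources."""
    return [(a, b)
            for a in links
            for b in links
            if a["cause"] == start and a["effect"] == b["cause"]
            and b["effect"] == end and a["source"] == b["source"]]

links = [
    {"cause": "Training", "effect": "New skills", "source": "A"},
    {"cause": "New skills", "effect": "Adoption", "source": "B"},
]
print(same_source_chains(links, "Training", "Adoption"))  # [] -- no one claimed the whole path
```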


10. Conclusion#

If you want a QDA method that is:

- grounded, link by link, in quotes and sources;
- explicit about every transformation between the evidence and a view;
- scalable, with AI confined to a constrained, checkable role;

then causal mapping can be understood as causal QDA: a serious, checkable member of the QDA family, and a pragmatic bridge between rich narrative evidence and reproducible analysis pipelines.

References#

Friese, S. (2025). Conversational Analysis with AI - CA to the Power of AI: Rethinking Coding in Qualitative Analysis. SSRN. https://doi.org/10.2139/ssrn.5232579

Wickham, H. (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1–29. https://doi.org/10.18637/jss.v040.i01